Xssist Blog

Server load high: CPU bound

When a server is unresponsive or slow, one of the first reactions is to run 'uptime' or 'top'. Most users, and unfortunately many sysadmins, stop there and conclude that the CPU is too slow. A lot of the time, in a webhosting environment with control panels, logging, and statistics generation such as awstats, analog, and webalizer running, the fault lies in insufficient disk IO. However, sometimes we do see the CPU being the bottleneck. How do we know the CPU really is working like mad, and it's not waiting for disk IO to complete? Simple: run 'top' or 'iostat' and check the iowait figure. It should not remain high. How high is high? That is left as an exercise. There are several common scenarios in webhosting and on dedicated servers with cPanel where CPU usage gets high, the applications do not get the attention they need, and the user's experience suffers.
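If you want the iowait number without eyeballing 'top', here is a rough Python sketch that computes it from two samples of the aggregate cpu line in /proc/stat on Linux. The function names (parse_cpu_line, iowait_percent, sample_iowait) are made up for this example, and the 5 second interval is just an assumption, not a recommendation.

```python
import time


def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted inside user and nice,
    # so only the first eight fields go into the total.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return the iowait percentage."""
    with open(path) as f:
        first = f.readline()
    time.sleep(interval)
    with open(path) as f:
        second = f.readline()
    return iowait_percent(first, second)
```

Calling sample_iowait() a few times during a slow period gives a quick feel for whether the box is waiting on disk or really burning CPU. What counts as "high" still depends on your hardware and workload.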

1. Runaway processes

Sometimes user applications are buggy and just keep running. Take something as simple as while (1){}; if the web site gets just a few hits a second, the server will crawl real soon, and then it goes into a death spiral. First the CPU runs flat out, then the processes add up, and there are more and more httpd and php processes which take up more and more RAM. RAM gets used up, processes start to swap out to disk, and it goes further downhill from there.

One really cute scenario I encountered along these lines: the user application calls itself. Something like "curl http://usersite/a.php" in a.php itself, so every hit spawns another request, which spawns another, and so on.

Of course, it's possible to limit the number of processes owned by a user, the total number of httpd processes, the amount of memory taken, CPU time, etc. That is out of scope for this entry though. I am only trying to describe CPU bound processes here, so I will just list keywords that will help if you need to look further: ulimit, /etc/security/limits.conf, httpd.conf

Just chmod 000 the offending scripts or web directories of the user accounts that are running crazy, and kill -9 the processes. Also run crontab -l -u userid to see if there are any cron jobs running.
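To work out which account is the one running crazy, it helps to add up CPU usage per user. A minimal Python sketch, assuming the input is the text output of `ps -eo user,pcpu,comm` on Linux; the names cpu_by_user and run_ps are hypothetical. Note that ps reports %CPU averaged over the lifetime of each process, so this is a rough indicator rather than a live reading like 'top'.

```python
import subprocess
from collections import defaultdict


def cpu_by_user(ps_output):
    """Sum %CPU per user from `ps -eo user,pcpu,comm` output, highest first."""
    totals = defaultdict(float)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue
        try:
            totals[fields[0]] += float(fields[1])
        except ValueError:
            continue  # not a number, ignore the row
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


def run_ps():
    """Run ps and return its output as text (assumes procps ps on Linux)."""
    return subprocess.check_output(["ps", "-eo", "user,pcpu,comm"], text=True)
```

Something like cpu_by_user(run_ps())[:5] lists the five heaviest accounts, which tells you where to point chmod, kill and crontab -l.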

If you are in a new job, have just taken over a server, and there are processes running which you can't find in /etc/init.d, check /etc/cron.*. These are all CentOS paths and filenames, so YMMV if you are using other flavours. That's why we try to stick to one flavour of Linux (and Unix) as far as possible. If you have FreeBSD, Red Hat, Debian, Solaris, Tru64, AIX etc. in one shop... good for your resume, not too good for your sanity :)

2. Deliberate hacking attempt

Many web applications are unfortunately buggy and easily exploited with exploit scripts that are distributed around. Shared web servers are easily hacked in some way or other. The guys who hack the site then run scripts which keep their IRC processes alive, port scan, or send out UDP floods. A lot of these scripts are very buggy, worse than the web applications, and they tend to take up 100% of CPU. This makes them easy to detect when they show up in 'top'. A few things need to be done: trace which application is vulnerable (modsecurity can be very helpful here, with auditing enabled); disable perl for the nobody user, e.g. using setfacl (unfortunately, this breaks some cPanel functions); and set up iptables to disallow outgoing traffic to unknown ports, especially UDP if not required. Again, lots of topics here which are out of scope.

3. Normal processes such as mysql or apache just taking longer and longer to complete as your sites get busier

my.cnf, the mysql config file, as it comes with cPanel is not good for busy sites. If you run mysqladmin status and see lots of slow queries, or run mysqladmin processlist and see lots of queries that are taking more than 1 second or so to complete, it's time to look into my.cnf and see if you can tweak it a bit, usually by giving it lots of cache. Google for my.cnf optimizations and you should get lots of hits. Also check that your databases are properly indexed. How about apache? A really good thing to do there is to install eaccelerator, which caches compiled PHP scripts.
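Picking the long-running queries out of a busy processlist by eye gets tedious, so here is a small Python sketch that filters the table printed by mysqladmin processlist. It assumes the usual pipe-delimited layout with Command, Time and Info columns; the function name long_running_queries and the min_seconds parameter are made up for this example.

```python
def long_running_queries(processlist_text, min_seconds=1):
    """Return rows from `mysqladmin processlist` output whose Time exceeds min_seconds."""
    header = None
    slow = []
    for line in processlist_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border rows
        inner = line[1:-1]  # drop the leading and trailing pipe
        if header is None:
            header = [h.strip() for h in inner.split("|")]
            continue
        # Limit the split so a '|' inside the last column (Info) stays intact.
        cells = [c.strip() for c in inner.split("|", len(header) - 1)]
        row = dict(zip(header, cells))
        if row.get("Command") != "Query":
            continue  # ignore sleeping connections and the like
        try:
            seconds = int(row.get("Time", ""))
        except ValueError:
            continue
        if seconds > min_seconds:
            slow.append(row)
    return slow
```

Feed it the captured output of mysqladmin processlist and look at the Info field of each returned row. The queries that keep showing up there are the ones to check for missing indexes.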

4. Application upgrade goes wrong

An application such as moodle can go from working smoothly to taking up all your CPU from one version to the next, due to upgrades going wrong, usually because of differences in the database tables. If you can, it's really best to do a brand new installation and then a migration, rather than upgrading in place.

Lim Wee Cheong
01 Jan 2008
